Abstract

By means of a binary visibility graph, we present a novel method to study random binary
sequences. The behavior of some topological properties of the binary visibility graph, such as
the degree distribution, the clustering coefficient, and the mean path length, is investigated.
Several examples are then provided to show that numerical simulations confirm the accuracy
of the theorems for finite random binary sequences. Finally, we propose, for the first
time, three topological properties of the binary visibility graph as randomness criteria.



1 Introduction 

A relationship between time series analysis and complex networks has emerged [1, 2]. Zhang et
al. introduced a method of mapping between time series and complex networks and found that the
dynamics of a time series are encoded in the topology of the corresponding network [3, 4]. Lacasa et
al. proposed an alternative mapping between time series and complex networks based on the
visibility graph algorithm, which is able to discriminate uncorrelated randomness from chaotic series
[5, 6].

Recently, complex network theory has stimulated explosive interest in the study of social, informational,
technological and biological systems, resulting in a deeper understanding of complex systems
[7, 8, 9, 10]. We apply the visibility algorithm as a new method for the analysis of random binary sequences,
which converts binary sequences into complex networks. Previous works [11, 3] focused on the
dynamics of complex systems, which are usually recorded in the form of time series and
can be studied through their visibility graphs from a complex network perspective. The intent of this
paper is to propose a new binary visibility graph (BVG), which is a subgraph of the visibility
graph. The rest of the paper is organized as follows. In Sec. 2 we introduce the BVG algorithm.
In Sec. 3 we derive exact results for topological properties of the BVG, such as the degree distribution,
the local clustering coefficient, and long distance visibility, and we propose, for the first time, three
topological properties of the BVG as randomness criteria. This section is followed by an outlook section.

2 Construction of BVG 

We start with the description of the visibility graph. Consider an arbitrary sampled time series
$\{u_t : t = 1, 2, \ldots, N\}$. Each data point of the time series is encoded into a node of the visibility graph.
Two arbitrary data points $u_i$ and $u_j$ in the time series have visibility, and consequently become two
connected nodes in the associated graph, if every other data point $u_k$ with $i < k < j$ fulfills

$$u_k < u_i + (u_j - u_i)\,\frac{k - i}{j - i}. \qquad (2\text{-}1)$$

An example of a time series containing 20 data points and the associated visibility graph derived from
the visibility algorithm is illustrated in Fig. 1. By definition, any visibility graph extracted from a
time series is always connected, since each node sees at least its nearest neighbors, and the degree of
any node $u_t$ with $1 < t < N$ is at least 2. Furthermore, the constructed graph inherits several
properties of the series in its structure: periodic series convert into regular graphs, random
series convert into irregular random graphs, and fractal series convert into scale-free networks [5]. It
is also found that a visibility graph is invariant under affine transformations of the series data, since
the visibility criterion is invariant under rescaling of both horizontal and vertical axes, and under
horizontal and vertical translations [12].
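
The visibility criterion (2-1) can be illustrated with a minimal Python sketch. The function name `visibility_edges`, the 0-based indexing, and the two toy series are assumptions made for illustration; the naive triple loop is not meant as an efficient implementation.

```python
from itertools import combinations

def visibility_edges(series):
    """Return the edges (i, j), i < j, of the visibility graph of `series`.

    Naive check of criterion (2-1) for every pair; indices are 0-based here.
    """
    edges = []
    for i, j in combinations(range(len(series)), 2):
        ui, uj = series[i], series[j]
        # Every intermediate point must lie strictly below the straight line
        # joining (i, u_i) and (j, u_j). Adjacent points have no intermediate
        # point, so they are always connected.
        if all(series[k] < ui + (uj - ui) * (k - i) / (j - i)
               for k in range(i + 1, j)):
            edges.append((i, j))
    return edges

print(visibility_edges([3, 1, 2]))  # the valley at index 1 does not block 0 and 2
print(visibility_edges([1, 3, 1]))  # the peak at index 1 blocks 0 and 2
```

In the first toy series the two outer points see each other over the valley, while in the second the central peak hides them from each other.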

The BVG is an algorithm that maps a binary sequence into a graph (as shown in Fig. 2). Here, we
briefly describe the binary visibility algorithm (abbreviated as BVA) in the following way:

Let $\{x_i\}_{i=1,\ldots,N}$ be a binary sequence of N bits. The algorithm assigns each bit of the binary sequence
to a node in the BVG. Two nodes i and j in the BVG are
connected if one can draw a visibility line in the binary sequence joining $x_i$ and $x_j$ that does not
intersect the height of any intermediate bit. $x_i$ ($x_j$) can only be 0 or 1. Therefore, i and j are two
connected nodes if the following geometrical criterion is satisfied within the binary sequence:

$$x_i + x_j > x_n \ \text{ with } \ x_n = 0 \quad \text{for all } n \text{ such that } i < n < j. \qquad (2\text{-}2)$$

It is important to note that, given a binary sequence, its BVG is a subgraph of its associated
visibility graph. Consequently, as in the former case, the BVG associated with a binary sequence
is always connected and undirected, since each node sees at least its first neighbors (left-hand and
right-hand). In what follows we show that the simplicity of the binary version of the algorithm
makes it analytically solvable and geometrically simpler, and that this new method can distinguish
between random and non-random binary sequences.
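
A minimal Python sketch of the construction, under one reading of criterion (2-2): adjacent bits are always connected, and two non-adjacent bits are connected when every intermediate bit is 0 and at least one of the two end bits is 1. The function name `bvg_edges` and the 0-based indexing are assumptions made for illustration.

```python
def bvg_edges(bits):
    """Return the edges (i, j), i < j, of the binary visibility graph of `bits`."""
    n = len(bits)
    edges = []
    for i in range(n - 1):
        j = i + 1
        edges.append((i, j))               # first neighbors always see each other
        # Move right while the bit just passed is 0, so that all intermediate
        # bits between i and the new j are zeros.
        while bits[j] == 0 and j + 1 < n:
            j += 1
            if bits[i] + bits[j] > 0:      # x_i + x_j > x_n = 0
                edges.append((i, j))
    return edges

print(bvg_edges([1, 0, 0, 1, 0]))
print(bvg_edges([0, 0, 0]))  # two zeros separated by a zero are not connected
```

Because the scan from each bit stops at the first 1 to its right, the cost stays close to linear for typical random sequences, although a long run of zeros makes it quadratic in the run length.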

3 Topological properties of the BVG 

In order to investigate some statistical characteristics of binary sequences, the following assumptions
are made with respect to the random binary sequences to be tested:

Uniformity: Zeros and ones occur with equal probability, i.e. if a sequence is of
length n, the expected number of ones (or zeros) is n/2.

Scalability: Any subsequence should have the same statistical character as the sequence it is
randomly extracted from, i.e. any test applicable to a sequence can also be applied to its
subsequences.

Consistency: The behavior of a generator must be consistent across starting values (seeds). 
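
The uniformity assumption can be checked, for instance, with a frequency (monobit) test in the style of the NIST suite. The sketch below is an illustration only: the function name `monobit_p_value` and the two toy inputs are assumptions, and the 0.01 threshold mentioned in the comment is the significance level commonly used with that suite.

```python
import math

def monobit_p_value(bits):
    """P-value of the frequency (monobit) test for a sequence of 0/1 values."""
    n = len(bits)
    s = sum(1 if b else -1 for b in bits)   # map 0 -> -1 and 1 -> +1, then sum
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))

# A perfectly balanced sequence gives the largest possible p-value,
# while a constant sequence is rejected at the 0.01 level.
print(monobit_p_value([0, 1] * 50))
print(monobit_p_value([1] * 100))
```

A balanced but perfectly periodic sequence passes this check, which is why uniformity alone is not taken as a sufficient criterion.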

Under the above framework, the National Institute of Standards and Technology (NIST) statistical test
suite for random binary sequences (which can be freely downloaded from http://csrc.nist.gov/rng/)
offers a battery of sixteen statistical tests [13]. In the following three subsections we
present three intuitive interpretations of the topological properties of the BVG.

3.1 Degree distribution 

Let us consider a bi-infinite binary sequence created from a binary-valued random variable X (with
x as its values) such that $x \in \{0, 1\}$. For simplicity, we will label a generic bit $x_0$ as the "seed" bit
hereafter. In order to obtain the degree distribution P(k) [14] of the associated graph, we
estimate the probability that an arbitrary bit with value $x_0$ observes k other bits. If
k bits are observed by $x_0$, there are two bounding bits, one on the right-hand side (R.H.S) of $x_0$
and the other on its left-hand side (L.H.S), so that the remaining $k - 2$ visible bits are
located inside that window, i.e. they are zeros. This implies that the minimum possible degree is k = 2.
These "inner" bits are ordered by their position relative to the seed (on its left or right side).
Hence there are exactly $k - 1$ different
possible configurations $\{C_i\}_{i=0,\ldots,k-2}$, where the index i determines the number of inner bits on the
right-hand side of $x_0$ (see Fig. 3). It should be mentioned that the case $k = 4$ and $x_0 = 0$ is
an exception, since the seed is always in between two inner bits. In this paper, for a more exact
analysis, we study the cases $x_0 = 0$ and $x_0 = 1$ separately.

As a first example, the set of possible configurations for a seed bit $x_0$ with $k = 4$
is shown in Fig. 3. The sign of the subindex in $x_i$ indicates whether the bit
is located on the L.H.S or the R.H.S of $x_0$. Therefore, the subindex of a bounding bit directly indicates
the number of bits located on that side. As an example, for $x_0 = 1$, $C_0$ is the configuration where
none of the $k - 2 = 2$ inner bits is located on the L.H.S of $x_0$, and hence the left bounding bit
is labeled $x_{-1}$ and the right bounding bit is labeled $x_3$. For $x_0 = 0$, $C_0$ is the configuration
where one of the $k - 2 = 2$ inner bits is located on the L.H.S of $x_0$, and therefore the left bounding
bit is labeled $x_{-2}$ and the right bounding bit is labeled $x_{n+1}$. Note that n hidden bits can
be located on the R.H.S of the inner bit. For $x_0 = 1$, $C_1$ is the configuration in which one inner bit
is located on the L.H.S of $x_0$ and the other inner bit is located on its R.H.S. For $x_0 = 0$, $C_1$ is
the configuration in which $n_1$ hidden bits are located on the L.H.S of $x_0$ and $n_2$ hidden bits are
located on its R.H.S. Finally, for $x_0 = 1$, $C_2$ is the configuration in which both inner bits are located
on the L.H.S of the seed. For $x_0 = 0$, $C_2$ is the configuration where one of the $k - 2 = 2$ inner bits
is located on the R.H.S of $x_0$, and therefore the right bounding bit is labeled $x_2$ and the left
bounding bit is labeled $x_{-(n+1)}$. Notice that n hidden bits can be located on the L.H.S of the
inner bit (see Fig. 3).

Consequently, $C_i$ corresponds to the configuration in which i inner bits are placed on the R.H.S of
$x_0$ and $k - 2 - i$ inner bits are placed on its L.H.S. Each of these possible configurations has an
associated probability $p_i \equiv p(C_i)$ that contributes to P(k) such that

$$P(k) = \sum_{i=0}^{k-2} p_i. \qquad (3\text{-}3)$$

Now, the calculation of a general relation for P(k) proceeds in the following steps.
In the first step, we evaluate Eq. (3-3) for k = 2, i.e. the probability that
the seed bit has two and only two visible bits. These are obviously the bounding bits, which we
label $x_{-1}$ and $x_1$ for the L.H.S and R.H.S of the seed, respectively. Summed over $k \ge 2$,
the total probability of what $x_0$ sees is 1, because any bit in the binary visibility
algorithm introduced in Sec. 2 sees at least its first neighbors. Now, let us look at the particular case of
Eq. (3-3) taken at k = 2.

For $x_0 = 0$:

$$p(x_0 = 0) = \mathrm{Prob}(x_1, x_{-1} = 1) = \frac{1}{8}.$$

For $x_0 = 1$:

$$p(x_0 = 1) = \mathrm{Prob}(x_1, x_{-1} = 1) = \frac{1}{8}.$$

Then,

$$P(k = 2) = p(x_0 = 0) + p(x_0 = 1) = \frac{1}{4}. \qquad (3\text{-}4)$$

In this step, we evaluate Eq. (3-3) for k = 3, i.e. for a seed that has
three and only three observable bits. In this case we encounter two different configurations:
$C_0$, in which $x_0$ has two bounding visible bits ($x_{-1}$ and $x_2$, respectively) and an R.H.S inner bit
($x_1$), and $C_1$, which is the same but with the inner bit placed on the L.H.S of the seed; so

$$P(k = 3) = p(C_0) + p(C_1) = p_0 + p_1.$$

Note that for $x_0 = 0$ an arbitrary number n of hidden bits $b_1, b_2, \ldots, b_n$ can
be located between the inner and the bounding bits, and this fact needs to be taken into account in
the probability calculation. The geometrical restrictions for the hidden bits are $b_j = 0$ ($j = 1, \ldots, n$)
for $C_0$ and $d_j = 0$ ($j = 1, \ldots, n'$) for $C_1$. Then,

$$p_0(x_0 = 0) = \mathrm{Prob}[(x_{n+1}, x_{-1} = 1) \cap (\{b_j = 0\}_{j=1,\ldots,n})],$$

$$p_1(x_0 = 0) = \mathrm{Prob}[(x_{-(n'+1)}, x_1 = 1) \cap (\{d_j = 0\}_{j=1,\ldots,n'})].$$

At this stage we have to consider all possible hidden-bit configurations ($C_0$ without hidden bits,
$C_0$ with a single hidden bit, $C_0$ with two hidden bits, and so on, and the same for $C_1$). With a little
calculation, one obtains

$$p_0(x_0 = 0) = \frac{1}{16}\Bigl[1 + \sum_{n=2}^{\infty} \prod_{j=2}^{n} p(b_j)\Bigr] = \frac{1}{8}, \qquad (3\text{-}5)$$

where the first term in the square bracket in Eq. (3-5) corresponds to the contribution of a configuration
with no hidden bits and the second sums over the contributions of n hidden bits. For $x_0 = 1$,

$$p_0(x_0 = 1) = \mathrm{Prob}[(x_2, x_{-1} = 1) \cap (x_1 = 0)] = \frac{1}{16}.$$

A similar result is found for $p_1$, since the configurations $C_0$ and $C_1$ are
symmetrical. Ultimately, one gets

$$P(k = 3) = 2\,\bigl(p_0(x_0 = 0) + p_0(x_0 = 1)\bigr) = \frac{3}{8}. \qquad (3\text{-}6)$$






To continue the evaluation, we need to calculate the contributions to Eq. (3-3) for k = 4,
i.e. for a seed that has four and only four observable bits. For $x_0 = 1$ ($x_0 = 0$), we encounter
three different configurations: $C_0$, in which $x_0$ has two bounding visible bits $x_{-1}, x_3$ ($x_{-2}, x_{n+1}$)
and two R.H.S inner bits $x_1, x_2$ (inner bits $x_{-1}, x_1$), and the same for $C_1$ and $C_2$ but with inner bits
placed on the L.H.S of the seed; so

$$P(k = 4) = p(C_0) + p(C_1) + p(C_2) = p_0 + p_1 + p_2.$$

Note that for $x_0 = 0$ an arbitrary number of hidden bits can
be located between the inner and the bounding bits, and this fact needs to be taken into
account in the probability calculation. The geometrical restrictions for the hidden bits $b_j$ ($b_i$) are
$b_j = 0$ ($j = 1, \ldots, n_2$) [$b_i = 0$ ($i = -1, \ldots, -n_1$)] for $C_0$, and the same for $C_1, C_2$. Then,

$$p_0(x_0 = 0) = \mathrm{Prob}[(x_{n_2+1}, x_{-(n_1+1)} = 1) \cap (\{b_j = 0\}_{j=1,\ldots,n_2}) \cap (\{b_i = 0\}_{i=-1,\ldots,-n_1})].$$

Now, we need to consider every possible hidden-bit configuration ($C_0$ without hidden bits, $C_0$ with
a single hidden bit, $C_0$ with two hidden bits, and so on, and the same for $C_1, C_2$). With a little
calculation, one obtains

$$p_0(x_0 = 0) \propto 1 + 2\sum_{n=2}^{\infty}\Bigl(\prod_{j=2}^{n} p(b_j)\Bigr), \qquad (3\text{-}7)$$

where the first term on the right-hand side of Eq. (3-7) corresponds to the contribution of a configuration
with no hidden bits and the second sums over the contributions of $n_1$ and $n_2$ hidden bits. For $x_0 = 1$,

$$p_0(x_0 = 1) = \mathrm{Prob}[(x_3, x_{-1} = 1) \cap (x_1 = 0) \cap (x_2 = 0)] = \frac{1}{32}.$$

We obtain similar results for $p_1$ ($p_2$), and consequently the configuration provided by $C_1$ ($C_2$) is
symmetrical to the one provided by $C_0$. Ultimately, one gets

$$P(k = 4) = 3\,\bigl(p_0(x_0 = 0) + p_0(x_0 = 1)\bigr). \qquad (3\text{-}8)$$

Let us proceed by tackling the case P(k = 5), that is, the probability that the seed has five and only
five visible bits. Four different configurations arise: $C_0$, in which $x_0$ has two bounding visible bits
$x_{-1}, x_4$ and three right-hand side inner bits $x_1, x_2, x_3$, and the same for $C_1, C_2, C_3$ but
with inner bits placed on the left-hand side of the seed; so

$$P(k = 5) = p(C_0) + p(C_1) + p(C_2) + p(C_3) = p_0 + p_1 + p_2 + p_3.$$

Then,

$$p_0(x_0 = 1) = \mathrm{Prob}[(x_4, x_{-1} = 1) \cap (x_1 = 0) \cap (x_2 = 0) \cap (x_3 = 0)] = \frac{1}{64}.$$

We find an identical result for $p_1$ ($p_2, p_3$), and consequently the configuration provided by $C_1$ ($C_2, C_3$)
is similar to the one provided by $C_0$. Ultimately, one gets

$$P(k = 5) = 4\,p_0(x_0 = 1) = \frac{4}{64}. \qquad (3\text{-}9)$$

The results of the present calculations are summarized:

$$P(x_0 = 0) = \begin{cases} \dfrac{1}{8}, & k = 2 \\[4pt] \dfrac{1}{2^{k-1}}, & k = 3, 4 \\[4pt] 0, & k \ge 5 \end{cases} \qquad\qquad P(x_0 = 1) = \frac{k - 1}{2^{k+1}}, \quad k \ge 2. \qquad (3\text{-}10)$$

Therefore, we can argue that, for $k \ge 5$:

$$P(k) = \frac{k - 1}{2^{k+1}}.$$

But, in general,

$$P(k) = P(x_0 = 0) + P(x_0 = 1). \qquad (3\text{-}11)$$



We conclude that the degree distribution P(k) of the associated BVG has a semi-exponential
form.
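
As a short consistency check, assuming the $x_0 = 1$ branch of the degree distribution is $P(x_0 = 1) = (k-1)/2^{k+1}$ for $k \ge 2$, its sum over k should equal the probability 1/2 of finding a seed with value 1 under the uniformity assumption. The substitution $m = k - 1$ and the identity $\sum_{m \ge 1} m\,y^m = y/(1-y)^2$ are the only tools used.

```latex
\sum_{k=2}^{\infty} \frac{k-1}{2^{k+1}}
  = \frac{1}{4} \sum_{m=1}^{\infty} \frac{m}{2^{m}}
  = \frac{1}{4} \cdot \frac{1/2}{\left(1 - 1/2\right)^{2}}
  = \frac{1}{4} \cdot 2
  = \frac{1}{2}.
```

The same branch contributes $\sum_{k \ge 2} k\,(k-1)/2^{k+1} = 2$ to the mean degree.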

The goodness-of-fit between the theoretical degree distribution, Eq. (3-11),
and the numerical results provides a measure of uniformity. In order to further confirm the
accuracy of our analytical results for the case of finite binary sequences, we have performed several
numerical simulations. We have generated random binary sequences of 10^ bits and their associated
BVG. In Fig. 4 we have plotted the degree distribution of the resulting graphs (triangles correspond
to a sequence extracted from a CCCBG tent map [15], while circles correspond to one extracted from
a CCCBG logistic map [15, 16]). The line is the best fit of the theoretical prediction, showing
very good agreement with the numerics.
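
A numerical experiment of this kind can be sketched in Python as follows. This is a minimal illustration, not the authors' simulation: the generator is Python's `random` module rather than the CCCBG maps, the length `N` and the seed are arbitrary choices, and `bvg_degrees` is a hypothetical helper that applies criterion (2-2) with adjacent bits always connected. Only the $x_0 = 1$ branch, assumed to be $(k-1)/2^{k+1}$, is compared.

```python
import random
from collections import Counter

def bvg_degrees(bits):
    """Degree of every node of the BVG of `bits` under criterion (2-2)."""
    n = len(bits)
    deg = [0] * n
    for i in range(n - 1):
        j = i + 1
        deg[i] += 1                        # adjacent bits always see each other
        deg[j] += 1
        while bits[j] == 0 and j + 1 < n:  # bit j becomes an intermediate zero
            j += 1
            if bits[i] + bits[j] > 0:      # x_i + x_j > x_n = 0
                deg[i] += 1
                deg[j] += 1
    return deg

random.seed(0)                             # arbitrary seed, for repeatability only
N = 50_000                                 # illustrative length
bits = [random.randint(0, 1) for _ in range(N)]
deg = bvg_degrees(bits)

# Empirical joint frequency of (seed value 1, degree k) next to (k - 1) / 2^(k + 1).
ones = Counter(d for b, d in zip(bits, deg) if b == 1)
for k in range(2, 9):
    print(k, ones[k] / N, (k - 1) / 2 ** (k + 1))
```

The few nodes near the two ends of the finite sequence have truncated neighborhoods, so small deviations from the bi-infinite prediction are expected.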

3.2 Local clustering coefficient distribution 

By means of geometrical arguments, we can obtain the local clustering coefficient C [8, 7, 17, 18, 14]
of a BVG associated with a binary sequence. For a reference node i, C is the fraction of pairs of nodes
connected to i that are also connected to each other. In other
words, we have to work out, for a reference node i, how many pairs of the nodes visible to i have
mutual visibility (triangles), normalized by the number of possible triangles, $\binom{k}{2}$. As a first step, if a
generic node i has degree k = 2, its neighbors are simply the two bounding bits, which have
mutual visibility. Hence, in this case there exists one triangle and C(k = 2) = 1. Now, if a
generic node i has degree k = 3 (k = 4, 5), one (two, three) of its neighbors will be inner bits,
which only have visibility of one of the bounding bits (by construction). We conclude
that in this case we can only form three (five, five) triangles out of three (six, ten) possible ones,
thereby:

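$$C(k) = \begin{cases} 1, & k = 2, 3 \\[4pt] \dfrac{5}{6}, & k = 4 \\[4pt] \dfrac{2(2k-5)}{k(k-1)}, & k \ge 5 \end{cases} \qquad (3\text{-}12)$$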



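This relation between k and C for $k \ge 5$ allows us to deduce the local clustering coefficient distribution
P(C). In general,

$$P(C) = \begin{cases} \dfrac{f(C) - C + 4}{C\,2^{\frac{5C + 4 + f(C)}{2C}}}, & 0 < C \le \frac{1}{2} \\[8pt] P(k = 4), & C = \frac{5}{6} \\[4pt] P(k = 2) + P(k = 3), & C = 1 \end{cases} \qquad (3\text{-}13)$$

where $f(C) = (C^2 - 32C + 16)^{1/2}$.
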
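
The step from C(k) to P(C) can be sketched as follows. Two assumptions are made here: that the relation for $k \ge 5$ is $C(k) = 2(2k-5)/(k(k-1))$, which is the form consistent with the function $f(C)$ quoted in the text, and that $P(k) = (k-1)/2^{k+1}$ for $k \ge 5$.

```latex
% Invert C(k) for k >= 5:
C\,k(k-1) = 4k - 10
\;\Longrightarrow\;
C\,k^{2} - (C + 4)\,k + 10 = 0
\;\Longrightarrow\;
k = \frac{C + 4 + f(C)}{2C},
\qquad
f(C) = \sqrt{(C+4)^{2} - 40\,C} = \sqrt{C^{2} - 32\,C + 16}.

% The root with the plus sign is the one giving k = 5 at C = 1/2. Substituting into P(k):
k - 1 = \frac{f(C) - C + 4}{2C},
\qquad
k + 1 = \frac{f(C) + 3C + 4}{2C},
\qquad
P(C) = \frac{k - 1}{2^{\,k+1}}
     = \frac{f(C) - C + 4}{C\; 2^{\frac{5C + 4 + f(C)}{2C}}},
\qquad 0 < C \le \tfrac{1}{2}.
```

For instance, $C = 1/2$ gives $f = 1/2$ and $k = 5$, while $C = 7/15$ gives $f = 17/15$ and $k = 6$.
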
To confirm the validity of this latter relation within finite binary sequences, in Fig. 5 we illustrate
the clustering distribution of a BVG associated with a random binary sequence of 10^ bits (circles)
obtained numerically. The line is the best fit of the theoretical prediction, and the triangles correspond to the
theoretical prediction at C = 1 and C = 5/6, in very good agreement with the numerics.

The goodness-of-fit between the theoretical clustering distribution, Eq.
(3-13), and the numerical results provides a measure of consistency.
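
The local clustering coefficient of each node can be measured directly on a BVG with a short Python sketch. The helper names `bvg_adjacency` and `local_clustering` are hypothetical, the construction follows criterion (2-2) with adjacent bits always connected, and nodes of degree below 2 are assigned C = 0 by convention (an assumption).

```python
from itertools import combinations

def bvg_adjacency(bits):
    """Adjacency sets of the BVG of `bits` under criterion (2-2)."""
    n = len(bits)
    adj = [set() for _ in range(n)]
    for i in range(n - 1):
        j = i + 1
        adj[i].add(j)                      # first neighbors are always connected
        adj[j].add(i)
        while bits[j] == 0 and j + 1 < n:  # bit j becomes an intermediate zero
            j += 1
            if bits[i] + bits[j] > 0:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def local_clustering(adj, i):
    """Fraction of pairs of neighbors of node i that are themselves connected."""
    k = len(adj[i])
    if k < 2:
        return 0.0                         # convention for degree 0 or 1
    links = sum(1 for a, b in combinations(adj[i], 2) if b in adj[a])
    return links / (k * (k - 1) / 2)

adj = bvg_adjacency([1, 0, 0, 1, 0])
print([local_clustering(adj, i) for i in range(len(adj))])
```

Collecting these values over a long random sequence and histogramming them gives an empirical clustering distribution that can be set against the theoretical one.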

3.3 Long distance visibility, mean degree, mean path length 

The mean path length scaling [14] can be derived as follows. Let us first estimate the probability P(n)
that two bits separated by n intermediate bits are two connected nodes in the graph. Consider
a binary sequence and its associated BVG. An arbitrary bit $x_0 = 1$ of the
sequence can "observe" $x_n = 1$ (and therefore would be connected to node $x_n$ in the graph) if and
only if $x_i = 0$ for all $i = 1, 2, \ldots, n - 1$. Then P(n) may be estimated as

$$P(n) = \mathrm{Prob}[(x_0, x_n = 1) \cap (\{x_i = 0\}_{i=1,\ldots,n-1})]. \qquad (3\text{-}14)$$

Now, we can derive the mean degree $\langle k \rangle$ of the binary visibility graph as follows:

$$\langle k \rangle = \sum_{k} k\,P(k) = 3.5, \qquad (3\text{-}15)$$

which can be obtained from P(n) as

$$\langle k \rangle = 3.5 \sum_{n=1}^{\infty} P(n). \qquad (3\text{-}16)$$
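
The mean degree of 3.5 can be checked numerically with a small Python sketch. The helper names, the sequence length and the seed are assumptions chosen for illustration, and the generator is Python's `random` module; the mean degree is computed as twice the number of edges divided by the number of nodes.

```python
import random

def bvg_edge_count(bits):
    """Number of edges of the BVG of `bits` under criterion (2-2)."""
    n = len(bits)
    count = 0
    for i in range(n - 1):
        j = i + 1
        count += 1                         # adjacent bits are always connected
        while bits[j] == 0 and j + 1 < n:  # bit j becomes an intermediate zero
            j += 1
            if bits[i] + bits[j] > 0:
                count += 1
    return count

def mean_degree(bits):
    """Mean degree of the BVG: every edge contributes to two nodes."""
    return 2 * bvg_edge_count(bits) / len(bits)

random.seed(1)                             # arbitrary seed, for repeatability only
bits = [random.randint(0, 1) for _ in range(100_000)]
print(mean_degree(bits))                   # expected to lie close to 3.5
```

For a fair random sequence, a seed with value 1 sees on average two bits on each side and a seed with value 0 sees on average 1.5 bits on each side, which averages to 3.5.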

Fig. 6 illustrates the adjacency matrix [14] of the BVG associated with a
random binary sequence of 500 bits (if nodes i and j are connected, the entry (i, j) is filled
in black; otherwise it is left blank). Since every bit $x_i$ has visibility of its first neighbors
$x_{i \pm 1}$, every node i is connected by construction to nodes $i - 1$ and $i + 1$: the graph is
thus connected. Fig. 6 indicates that the graph has a nearly homogeneous structure, i.e.
the adjacency matrix is filled around the main diagonal. Moreover, the matrix shows a
superposed compact structure, reflecting the visibility probability P(n) of Eq. (3-14), which introduces some
shortcuts in the BVG, much in the vein of the small-world model [12]. Here, P(n) denotes the
shortcut probability. From the statistical point of view, we can interpret the graph's structure as
nearly homogeneous, where the size of the local neighborhood does not change as the size of the
graph increases. Hence, we can approximate its mean path length L(N) as

$$L(N) \approx \sum_{n=1}^{N} n\,P(n). \qquad (3\text{-}17)$$

n=l n=l 

It is observed that a logarithmic scaling emerges, denoting that the BVG associated with a generic
random sequence is small world [12], which may be observed in Fig. 5. The numerical results
of L(N) (circles) for BVGs associated with several random binary sequences of increasing size
N (powers of 2) are plotted in Fig. 7. The line is the best fit of the theoretical prediction. The
goodness-of-fit between the theoretical mean path length, Eq. (3-17), and the
numerical results provides a measure of scalability.
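
The mean path length L(N) of a BVG can be measured numerically by a breadth-first search from every node, as in the Python sketch below. The helper names, the sizes and the seed are assumptions made for illustration; the construction follows criterion (2-2) with adjacent bits always connected, so the graph is connected and every distance is finite.

```python
import random
from collections import deque

def bvg_adjacency(bits):
    """Adjacency lists of the BVG of `bits` under criterion (2-2)."""
    n = len(bits)
    adj = [[] for _ in range(n)]
    for i in range(n - 1):
        j = i + 1
        adj[i].append(j)                   # first neighbors are always connected
        adj[j].append(i)
        while bits[j] == 0 and j + 1 < n:  # bit j becomes an intermediate zero
            j += 1
            if bits[i] + bits[j] > 0:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def mean_path_length(bits):
    """Average shortest-path distance over all pairs of distinct nodes."""
    adj = bvg_adjacency(bits)
    n = len(bits)
    total = 0
    for source in range(n):
        dist = [-1] * n
        dist[source] = 0
        queue = deque([source])
        while queue:                       # plain breadth-first search
            u = queue.popleft()
            for v in adj[u]:
                if dist[v] < 0:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist)
    return total / (n * (n - 1))           # each unordered pair is counted twice

random.seed(2)                             # arbitrary seed, for repeatability only
for exponent in range(5, 9):               # illustrative sizes 2^5 ... 2^8
    sample = [random.randint(0, 1) for _ in range(2 ** exponent)]
    print(2 ** exponent, mean_path_length(sample))
```

Running the loop over increasing sizes gives the kind of L(N) curve that can be set against the theoretical estimate.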

4 Conclusion and outlook 

In this article, we have investigated the binary visibility graph constructed from random binary
sequences. The present study illustrates the limited usefulness of previous works in the analysis of
random binary sequences [19, 20, 6]. We have also derived exact results on several topological
properties of the BVG associated with generic uncorrelated random binary sequences, and numerical
simulations confirmed their reliability for finite sequences. The results show that the three topological
properties of the binary visibility graph serve as excellent randomness criteria.

Furthermore, we hope that the results obtained in this paper will pave the way for further
studies on nonlinear dynamical systems.